perm filename SUP[HPP,DBL]1 blob
sn#195031 filedate 1976-01-05 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00009 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 .DEVICE XGP
C00004 00003 5↓_Contents_↓*
C00007 00004 5↓_1. Overview_↓*
C00023 00005 5↓_2. Tutorial Talk_↓*
C00026 00006 5↓_3. Details Talk_↓*
C00029 00007 5↓_4a. Glossary of Math Terms_↓*
C00045 00008 5↓_4b. Glossary of AI Terms_↓*
C00061 00009 5↓_5. Documentation_↓*
C00063 ENDMK
C⊗;
.DEVICE XGP
.FONT 1 "BASL30"
.FONT 2 "BASB30"
.FONT 4 "BASI30"
.FONT 5 "BDR40"
.FONT 6 "NGR25"
.FONT 7 "NGR20"
.FONT A "SUP"
.FONT B "SUB"
.TURN ON "↑α↓_π[]{"
.TURN ON "⊗" FOR "%"
.PAGE FRAME 54 HIGH 80 WIDE
.AREA TEXT LINES 4 TO 53
.COUNT PAGE PRINTING "1"
.TABBREAK
.ODDLEFTBORDER←EVENLEFTBORDER←1000
.AT "ffi" ⊂ IF THISFONT ≤ 4 THEN "≠" ELSE "fαfαi" ⊃;
.AT "ffl" ⊂ IF THISFONT ≤ 4 THEN "α∞" ELSE "fαfαl" ⊃;
.AT "ff" ⊂ IF THISFONT ≤ 4 THEN "≥" ELSE "fαf" ⊃;
.AT "fi" ⊂ IF THISFONT ≤ 4 THEN "α≡" ELSE "fαi" ⊃;
.AT "fl" ⊂ IF THISFONT ≤ 4 THEN "∨" ELSE "fαl" ⊃;
.MACRO B ⊂ BEGIN VERBATIM GROUP ⊃
.MACRO B1 ⊂ BEGIN PREFACE 0 INDENT 0 SINGLE SPACE ⊃
.MACRO E ⊂ APART END ⊃
.MACRO FAD ⊂ FILL ADJUST DOUBLE SPACE PREFACE 2 ⊃
.MACRO FAS ⊂ FILL ADJUST SINGLE SPACE PREFACE 1 ⊃
.FAS
.PAGE←0
.NEXT PAGE
.INDENT 0
.TURN OFF "{∞→}"
.GROUP SKIP 1
.BEGIN CENTER PREFACE 0
⊗5↓_AUTOMATED MATHEMATICIAN_↓⊗*
⊗5Supplementary Materials⊗*
for the Stanford ⊗2Heuristic Programming Project⊗* Workshop
⊗4January 5-8, 1976⊗*
Douglas B. Lenat
.END
.SELECT 1
.EVERY HEADING(⊗7Douglas B. Lenat⊗*,⊗4↓_Automated Mathematician_↓⊗*,⊗7SUPPLEMENT page {PAGE}⊗*)
⊗5↓_Contents_↓⊗*
.B1
1. Brief global description of the AM project
2. The tutorial talk
3. The details talk
4. Glossary of concepts and terms
5. Existing documentation
.E
.GROUP SKIP 3
⊗5↓_1. Overview_↓⊗*
Researchers in most branches of science frequently face the difficult
task of formulating research problems which must be soluble
yet nontrivial. In any given field, it is usually easier to tackle a
specific given problem than to propose interesting yet managable new
questions to investigate. For example, contrast ⊗4solving⊗* the
Missionaries and Cannibals problem with the more ill-defined
reasoning which led to ⊗4inventing⊗* it.
Let's restrict our attention to
creative theory formation in ⊗4mathematics⊗*:
how to propose interesting new concepts and plausible
hypotheses connecting them.
Although many great minds have introspected on this problem
[Poincare', Hadamard, Polya], we in AI all know the gulf that
separates smooth prose from smooth code.
The experimental vehicle of my research
is a computer program called ⊗2AM⊗* (for ⊗2↓_A_↓⊗*utomated
⊗2↓_M_↓⊗*athematician), which carries out some of the activities involved
in mathematical research: noticing simple relationships in empirical data,
formulating new definitions out of existing ones, proposing some
plausible conjectures (and, less importantly, sometimes proving them),
and evaluating the aesthetic "interestingness" of new concepts.
Before discussing how to ⊗4synthesize⊗* a new theory, consider briefly
how to ⊗4analyze⊗* one, how to construct a plausible chain of
reasoning which terminates in a given discovery. One can do this
by working backwards, by reducing the creative act to simpler and simpler
creative acts.
For example,
consider the concept of prime numbers. How might one be led to
define such a notion? Notice the following plausible strategy:
.GROUP
.ONCE INDENT 6,6,6
"If f is a function which transforms elements of A into
elements of B, and B is ordered, then consider just those members of A
which are transformed into ⊗4extremal⊗* elements of B.
This set is an interesting subset of A."
.APART
When f(x) means "factors of x", and the ordering is "by length", this
heuristic says to consider those numbers which have a minimal↑1 number
of factors -- that is, the primes.
So this rule actually ⊗4reduces⊗* our task from "proposing the concept of
prime numbers" to the more elementary problems of "inventing factorization"
and "discovering cardinality".
But suppose we know this general rule:
"If f is an interesting function, consider its inverse."
It reduces the task of discovering factorization to the simpler
task of discovering multiplication.
Eventually, this task reduces to the discovery of very basic notions, like
substitution, set-union, and
equality. To explain how a given researcher might have made a given
discovery, such an analysis is continued until that inductive task is
reduced to "discovering" notions which the researcher already knew.
Suppose a large collection of these heuristics has been
assembled (e.g., by analyzing a great many discoveries, and writing down
new heuristic rules whenever necessary.)
Instead of using them to ⊗4explain⊗* how a given idea might have
evolved, one can imagine
starting from a basic core of knowledge and
"running" the
heuristics to ⊗4generate⊗* new concepts.
Such syntheses are precisely what AM does.
The program consists of
a corpus of primitive mathematical concepts and a collection of guiding
heuristics.
AM's activities all serve to expand AM itself,↑2 to enlarge upon a
given body of mathematical knowledge.
To cope with the enormity of
the potential "search space" involved,
AM uses its heuristics as judgmental criteria to guide
development in the most promising direction.
It appears that the process of inventing valuable new concepts can be guided
successfully using a collection of a few hundred such heuristics.
Each concept is represented as a ⊗6BEING⊗*↑3, a frame-like data structure
with 30 different
facets or slots.
The types of slots include:
⊗6Examples, Definitions, Generalizations, Utility, Analogies,
Interestingness, Uninterestingness,⊗* and a couple dozen others.
The ⊗6BEINGs⊗*
representation provides a convenient scheme for organizing the
heuristics; for example, the following strategy
fits into the ⊗4Examples⊗* slot of the
⊗4Predicate⊗* concept:
"If, empirically,
10 times as many elements ⊗4fail⊗* some predicate P, as ⊗4satisfy⊗* it, then
some ⊗4generalization⊗* (weakened version) of P might be more interesting than P".
AM considers this suggestion after trying to fill in examples of any predicate.↑4
AM is initially given a large collection of core concepts, with only a few slots
filled in for each. Its sole activity is to choose some facet of
some concept, and fill in that particular slot. In so doing, new
notions will often emerge. Uninteresting ones are forgotten, mildly
interesting ones are kept as parts of one slot of one concept, and
very interesting ones are granted full concept status. Such new
⊗6Beings⊗* will have dozens of blank parts, hence the space of
possible actions (slots to fill in) grows rapidly.
The same heuristics are used both to suggest new directions for
investigation, and to limit attention: both to grow and to prune.
The particular mathematical domains in which AM operates depend on the choice of
initial concepts. Currently, AM is given about a hundred concepts, all of
which are what Piaget might describe as ⊗4prenumerical⊗*:
Sets, substitution, equality, relations, and so on. In particular, AM is not
told anything about proof, single-valued functions,
or numbers. Although it was never able to ⊗4prove⊗* the unique factorization
theorem, AM actually did ⊗4conjecture⊗* it.↑5 Before this, AM had to
define and investigate
concepts corresponding to those we refer to as cardinality,
multiplication, factors, and primes,
based on reasoning similar to that in the example above.
The main difficulty with AM is getting it to accurately judge
⊗4a priori⊗* the value of each new concept, to quickly lose interest
in concepts which aren't going to develop into anything.
As with many AI programs, one aspect of working on AM is the
degree of precision with which one's ideas must be formulated.
The resultant body of detailed heuristics may be the germ of a more efficient
programme for educating math students than the current dogma.↑6
But perhaps the most exciting prospect opened up by AM is that
of experimentation: one could vary the concepts AM starts with, vary the
heurisitics available, etc., and study the effects on AM's behavior.
AM is a dissertation project ⊗4in progress⊗*; few conclusions have been drawn
yet.
The issues to be elaborated upon include:
.B1
(1) What are these heuristics? Where do they come from, what is their
justification, their power?
(2) What is the AM program like? What is its control structure, its
representation for a concept? How do the heuristics fit in?
(3) How does the AM program work? What does it start with, what does it
do from there? How and why?
(4) What can we all learn from AM? Abstracted out, what are the new ideas,
the traps that were fallen into?
.E
.BEGIN GROUP SKIP 1 SELECT 7 INDENT 0,6,0 PREFACE 1 COMPACT
*****************************************************************************************************************
↑1 The other extreme, numbers with a MAXIMAL number of factors, will
also be proposed as worth investigating. This leads to many
interesting questions; the only "new-to-Mankind"
mathematical result so far is
in fact that such maximally-divisible numbers must have the form
p⊗B1⊗*↑a↑1p⊗B2⊗*↑a↑2p⊗B3⊗*↑a↑3...p⊗Bk⊗*↑a↑k, where the
p⊗Bi⊗*'s are the first k consecutive primes, and the exponents a⊗Bi⊗*
decrease with i, and the ratio of (a⊗Bi⊗*+1)/(a⊗Bj⊗*+1) is
approximately (as closely as is possibe for integers)
log(p⊗Bj⊗*)/log(p⊗Bi⊗*). For example, a typical divisor-rich number
is
n=2↑83↑55↑37↑211↑213↑117↑119↑123↑129↑131↑137↑141↑143↑147↑153↑1.
The progression of its exponents+1 (9 6 4 3 3 2 2 2 2 2 2 2 2 2 2 2)
is about as close as one can get to satisfying the "logarithm"
constraint. This number n has 3,981,312
divisors. The "AM Conjecture" is that no number smaller than n has
that many divisors. By the way, this n equals 25,608,675,584.
↑2 Incidentally, these basic concepts include the operators which
enlarge the space (e.g., ⊗6COMPOSE-2-RELATIONS⊗* is both a concept in
its own right and a way to generate new ones).
↑3 Lenat, Douglas, ↓_BEINGS: Knowledge as Interacting Experts_↓, 4th
IJCAI, 1975, pp. 126-133.
↑4 In fact, after AM attempts to find examples of SET-EQUALITY, so
few are found that AM decides to generalize that predicate. The
result is the predicate which means "Has-the-same-length-as" -- i.e.,
the rudiments of Cardinality.
↑5 Due to the firm base of preliminary concepts which AM developed,
this relationship was almost obvious. AM sought some predicate P
which, for each n, some member of FACTORS-OF(n) satisfied.
ALL-PRIMES was such a predicate. AM next constructed the relation
which associates, to each number n, all factorizations of n into
primes. The full statement of the UFT is simply that this relation
is a function; i.e., it is defined and single-valued for all numbers
n.
↑6 Currently, the educator takes the very best work any mathematician
has ever done, polishes it until its brilliance is blinding, and
presents it to the student to induce upon. A few individuals (e.g.,
Minsky and Papert at MIT, Adams at Stanford) are experimenting with
more realistic strategies for "teaching" creativity.
.END
.SKIP TO COLUMN 1
⊗5↓_2. Tutorial Talk_↓⊗*
↓_Objective_↓
Provide a solid introduction to the system for those
unacquainted with it. Also, try to
convey a feeling for what it is like to actually ⊗4do⊗* simple
mathematical research.
↓_Outline_↓
.B1
The task: proposing interesting new problems, concepts to investigate
How one can analyze a given discovery, "explain" it.
How to invert this process and ⊗4generate⊗* new discoveries
Representing math knowledge as cooperating experts
Comments on "theory formation" in general
Some problems that ⊗4you⊗* might encounter.
.E
⊗5↓_3. Details Talk_↓⊗*
↓_Objective_↓
After a brief review of the project, a few detailed issues
(which have analogues in most other projects) will be discussed.
The rest of the time will be spent in group planning/discussion
of AM. Some of these "planning" topics are listed below.
↓_Questions to discuss_↓
.B1
Using multiple representations and languages
Discovering ⊗4vs⊗* being led by the nose.
Philosophical justifications and implications of AM's foundation.
The name of a computer program: pretentious ⊗4vs⊗* meaningless
Choosing a domain for an AI program, or, "reality was never like this"
Several directions for future work. What should be the priorities?
AM next year at Stanford.
.E
↓_Prerequisite concepts_↓
Integers, Natural Numbers, Relations, Functions, Ordering: Just glance
at the Glossary for those items you feel uncomfortable about.
Prime numbers: What are they? Why are they interesting? Are they useful?
(If you are unsure, read the Primes definition in the Glossary).
Cardinality: If our concept of "number" is not primitive, what is it
"built" out of? How do young children learn it? Is it related to the
world around us, to our anatomy, to the size of our short-term-memory?
Mathematical Research ⊗4↓_vs_↓⊗* Finished Mathematics: If you don't perceive the
difference, read the glossary entries for
Mathematical concept, Mathematical theory, Mathematical intuition, and
Mathematical research.
Modular Representations of Knowledge, Cooperating Knowledge Sources,
Heterarchy: Despite their recent popularity in AI circles, many people
have very shaky interpretations of these terms. To see ⊗4my⊗* shaky
interpretations, read the Glossary entries for those 3 notions.
.SKIP TO COLUMN 1
⊗5↓_4a. Glossary of Math Terms_↓⊗*
Cardinality: the concept of "number". Two sets are of the same
cardinality if they have the same number of elements.
Composition of two relations R and S: This is a new relation denoted
R⊗7o⊗*S, and defined as R⊗7o⊗*S(x) = R(S(x)). So R⊗7o⊗*S maps
elements of the domain of S into elements of the range of R. Notice
that if R and S are both functions, then so is R⊗7o⊗*S. The intuitive
picture of this process is to operate on x with the relation S, and
⊗4then⊗* apply R to the results.
Function: an operation f which associates, to each element x of some
set D, an element f(x) of some set R. D and R are the domain and
range of f. Notice that a function may be considered a single-valued
relation.
Integers: positive and negative whole numbers; i.e. ...,-2, -1, 0, 1,
2,...
Map: used as a verb, this word indicates the action of applying a
function or a relation; e.g., we say that ⊗4squaring⊗* maps 7 into
49. Used as a noun, it is a synonym for function.
Mathematical concept: this is taken to mean all the constructions,
definitions, conjectures, operations, structures, etc. that a
mathematician deals with. Some examples: Set-intersection, Sets, The
unique factorization theorem, every entry listed in this glossary.
Mathematical intuition: this is the mental imagery which can be
brought to bear. Typically, we transform the situation to an
abstract, simplified one, manipulate it there, and re-translate the
results into the original notation. For example, our intuition about
"ordering" may involve the image of marks on a yardstick. We can then
answer questions involving ordering rapidly, using this
representation. Three features of the intuitive image should be
noted: (i) it is typically fast and simple, (ii) it is opaque, one
cannot introspect too easily on "why it works", and (iii) it is
fallible, occasionally leading to wrong results.
Mathematical research: The fundamental idea here is that mathematics
is an ⊗4empirical⊗* science, just as much as chemistry or physics.
In doing research, the ultimate goal is the creation of new,
interesting theories, but the techniques used include looking for
patterns in empirical data, inducing new conjectures, modelling some
aspects of the real world, etc. Although the final product looks like
a smooth, formal development, magically flowing from postulates to
lemmas to theorems, the actual research process involved untold blind
alleys, rough guesses, and hard work. (analogy: The process of
painting is rarely itself artistic.)
Mathematical theory: to qualify as a theory, we must have (i) a basis
of undefined primitive terms, (ii) definitions involving these, (iii)
axioms involving all the primitives and defined terms (iv)
conjectures and theorems relating these terms. To be at all
worthwhile, however, the theory must also meet the fuzzy requirements
that (v) there is some correspondence between the primitives and some
"real-world" concepts, between the axioms and some "real"
relationships, and (vi) some of the theorems are unexpected, hard to
prove, elegant, interesting, etc.
Natural numbers: non-negative integers; i.e., 0, 1, 2, 3,...
Ordering: the concept of "before" and "after". This distinguishes a
list from a bag (multiset). The formal axioms for ordering simply
state the obvious properties of the intuitive image of a list.
Prime numbers: natural numbers which have no divisors other than 1
and themself; e.g., 17, but ⊗4not⊗* 15 (=3x5). Primes are interesting
because of the myriad times they crop up in diverse theorems -- from
the Chinese Remainder Theorem (solving systems of linear congruence
equations), to the Law of Quadratic Reciprocity, to Fermat's Theorem
(for all integers n, for all primes p, n↑p is congruent to n (mod
p)). The "secret" of their value lies in the fact that all integers
can be factored ⊗4uniquely⊗* into a set of prime divisors. This
"Unique Factorization Theorem" lets us reduce questions about
integers to questions about primes.
Relation: an operation which associates, for each element of some set
D, a set of elements E = {e↓1, e↓2,...} of some set R. D and R are
the domain and range of the relation. For example, the realtion
"⊗6≤⊗*" associates to 5 the set of numbers {5, 6, 7, 8,...} -- i.e.,
all integers which 5 is less than or equal to. The domain and range
of this relation are the integers.
⊗5↓_4b. Glossary of AI Terms_↓⊗*
ACTORs: A modular form of representation, useful for distributing of
the task of ⊗4control⊗* among several components in a computer
program. Each ACTOR is a black box, with no parts or slots, but which
does have some assertions (a "contract") which he must honor. It
merely responds to a fixed set of messages, by sending out certain
messages of his own. These are delivered via a bureaucracy.
Recursive sending is permitted.
BEINGs: A modular form of representation of knowledge as a collection
of cooperating experts. Each module is a list of
Question/Answering-program pairs, where the set of questions is fixed
for all the Beings in the system. When any Being has a question, he
broadcasts it to the entire system, and some Being who recognizes it
will take over control and try to answer it by running ⊗4his⊗*
appropriate Answering-program. In the process of running this, some
new questions may arise. Notice that Beings distribute responsibility
for control and for static knowledge. The advantages of having each
BEING composed of the same structure, the same names for its "slots",
are (i) efficient communication between Beings, and (ii) easy
creation of and "filling out" of brand new Beings.
Cooperating Knowledge Sources: Very often, in tackling a problem, one
receives some hints and some constraints from very different sources,
phrased in very different languages, often addressing different
representations of the problem. For example, in trying understand a
human speaker, our memory of the previous discussion and knowledge of
the speaker may narrow down the possible ⊗4meanings⊗* of what he is
saying. Our ears, of course, register the precise acoustic wave-forms
he is uttering. Our English vocabulary forces us to interpret
imperfect signals as real words. Our eyes see his gestures and his
lip movements, and give us more information. All these different
sources of information must be used, and yet they all are talking in
different "languages" to us. The most trivial solution is to keep
all the sources independent, and keep working until one of them can
solve the problem all by itself. A much better solution is to
transform all their babblings into one canonical representation, one
single language. There are in fact no more profound ideas around yet
on this "interfacing" problem.
FRAMEs: A modular representation of knowledge. Each module is a list
of Feature/Value pairs. The ⊗4value⊗* represents a default assumption
which can be relied on until/unless new information comes in abut
that feature. Each frame has whatever ⊗4features⊗* (called "slots")
seem appropriate. Whenever a situation S is encountered, the
frame(s) for S are activated. As new information rolls in, it
replaces the default information in various slots. Notice the
emphasis on distributing static knowledge (⊗4data⊗*), not necessarily
control, in such a system.
Heterarchy: This term refers to the control structure of a computer
program. The typical hierarchical structure is one in which a function
calls a subroutine, which processes and then returns a value to that function.
A program is viewed as a tree structure, with lines indicating "calling".
Heterarchical structuring views the whole program as a collection
of equal partners, an unstructured set of functions.
"Control" is viewed as a spotlight,
which can be flicked from one function to another. The functions can
affect who does or doesn't get control next, but there is no
guarantee who will get control, or that control will revert back to
some function which once had it. Aside from the lure of its democratic flavor,
it is clearly a natural way to represent cooperating knowledge modules.
Modular Representations of Knowledge in AI Systems: (1) Definition:
Knowledge is partitioned into packets (called modules, frames, units,
experts, actors) along lines of: different applicabilities,
expertise, purpose, importance, generality, etc. Each packet is
structurally similar to all the rest. (2) Advantages: By having the
knowledge discretized, pieces can be added and/or removed with no
trouble. The knowledge of the system is easily inspected and
analyzed. The structural similarity yields several advantages: a
simple control system suffices to "run" all the knowledge, the
modules can intercommunicate easily, new modules can be inserted
without knowing precisely "who else" is already in the system. (3)
Examples: Some modular schemes (and their program incarnations) are:
Actors (Plasma), Frames, Beings (PUP6), Production Systems (PSG,
Dendral, Mycin), Predicate Calculus. (4) Relation to "Cooperating
Knowledge Sources" Although modular representation is a natural way
to implement cooperating knowledge sources, the two concepts are
distinct. For example, Hearsay uses opaque modules, which do ⊗4not⊗*
have similar structures, who communicate via a global blackboard. In
general, if the modules are to have non-standard structures, then the
inter-communication media must be a simple scheme (like a global
assertional data base, a blackboard).
⊗5↓_5. Documentation_↓⊗*
.BEGIN INDENT 0,3,0 PREFACE 0
1. Thesis Proposal: SU-AI file SYS4[TLK,DBL]
2. This supplementary file: SU-AI file SUP[HPP,DBL]
3. The text of the tutorial: SU-AI file TUT[HPP,DBL]
4. The text of the planning talk: SU-AI file DET[HPP,DBL]
5. The system itself: SUMEX files <LENAT> TOP6, CON6, and UTIL6.
To run: get into LISP, load <LENAT>L, follow instructions.
6. The use of BEINGS representation in AM is described in the paper:
⊗4Duplication of Human Actions by an Interacting Community of Knowledge Modules⊗*,
Proceedings of the Third International Congress of Cybernetics and Systems,
Bucharest, Roumania, August, 1975.
7. An English-like description of the heuristics for each facet of each
concept can be perused as SU-AI file GIVEN[TLK,DBL].
.END